- The 25+ best Black Friday Nintendo Switch deals 2024
- Why there could be a new AI chatbot champ by the time you read this
- The 70+ best Black Friday TV deals 2024: Save up to $2,000
- This AI image generator that went viral for its realistic images gets a major upgrade
- One of the best cheap Android phones I've tested is not a Motorola or Samsung
Codex Exposed Exploring the Capabilities and Risks of OpenAIs Code Generator
While GPT-3, the general-purpose language transformer that powers Codex, has recently been opened to the public, Codex itself remains a technical preview open to a limited selection of users. Codex powers the functionality of GitHub’s Copilot, a programming assistant available as a plug-in for Visual Studio Code that is able to offer AI-powered autocomplete and code translation on the fly.
Its capabilities are still rough around the edges, but they give a good idea of what the future has in store for programmers and computer scientists — and, of course, malicious users.
If such a system is bound to become a disruptive element in the daily work of computer engineers, it is natural to wonder how this could affect the activities of cybercriminals. With this in mind, we tested the extent of Codex’s capabilities, focusing on the most typical aspects of a cybercriminal: reconnaissance, social engineering, and exploitation.
In a series of blog posts, we explore how Codex’s current capabilities affect a malicious user’s everyday activities, what precautions developers and regular users can take, and how these capabilities might evolve. This is the first part of the series.
Scavenging for sensitive data
We know that language transformers are trained on massive corpora of text and source code taken from public repositories. We are unlikely to be the first ones to ask the question of what happens to all the information contained in the public repositories once it is sifted through the fine mesh of GPT-3’s neural network. While the first issues with Copilot’s proposing snippets of copyrighted code had already emerged, we wanted to see if sensitive information was present in GPT-3’s knowledge base and if it was possible to exfiltrate it by exploiting Codex’s code generation.
Personal and sensitive information leaks through code
Public repositories can be a treasure trove of sensitive data just waiting to be discovered by malicious actors. In our tests, we found that it is possible to trick Codex into exposing sensitive data being left in the repositories by having it generate code that would eventually require access to the data.
In the examples above, we can see how, by deceitfully asking Codex to generate “code to fetch a specific dataset,” we are presented with URLs containing the information we were looking for as part of the generated code. Granted, when we checked the provided URLs, the data had been long gone, but that seemed to be more related to the data’s being outdated than to the URLs’ being fake or generated.
However, it is important to keep in mind that this issue of outdated information will likely become less of a problem in the future, as soon as language transformers such as GPT-3 are retrained more frequently to take advantage of the latest training sets available.
URLs are not the only kind of data that could be accidentally exposed by Codex. Personal information on who worked on a certain piece of code, employer information, and even cryptocurrency wallet numbers are also up for grabs. In the example above, while looking for over-the-air firmware SIM numbers, we stumbled upon the name of a developer who, while apparently not employed anymore in the relevant project, was still active in firmware development.
In another case, after we asked Codex to implement a known vulnerability, the name of a vulnerability researcher popped up in the comments of a code snippet. We looked it up and there was indeed a researcher by that name. Even though it is public information, and one can find their GitHub repository once one knows their name, having the name remembered and exposed by GPT-3 might constitute a security risk.
Credential stuffing
In addition to personal information, it is also possible to lure Codex into autocompleting credentials and API entry points. In the example below, Codex gave us FedEx and DHL credentials. These credentials might have been for testing, expired, or simply random keystrokes. It is possible that the language model generated something that just “looked right.” However, hackers can be inspired with library names and parameter names to conduct further research.
As a side note, it is interesting to remark that Codex does not really behave like a question-answering system but rather like a very opinionated autocompletion system. This is evidenced by the fact that the completion occasionally involves the request prompt itself. In the example above, the second PayPal prompt was actually filled with information on the PayPal account by Codex itself, as it was in the example “psql” command below, where the initial request prompt was simply “psql”.
Conclusion
The importance of proper data hygiene in code repositories, both public and private, has already been demonstrated in various researches that show the worrying amount of critical information, such as credentials and personal data, left unchecked in repositories all over the internet.
The examples we present here just go to show how this is not at all a marginal issue, but something that will become more and more critical for developers in the near future, as AI engines will scavenge for code to enrich their training corpora, without the capability of effectively filtering out sensitive data, given the ever-increasing size of their training sets.
It becomes essential for developers and DevOps engineers to implement dedicated processes to continuously check for such data and put in place proper techniques to share credentials in a secure way. These include encrypting sensitive data whenever possible and sharing decryption keys only through secure storages, and for data that cannot be stored securely, employing a periodic rotation model to make sure that, by the time it ends up in some training set for a language model, it is already outdated.